(54) TROUBLE PROCESSING SYSTEM FOR MULTIPROCESSOR SYSTEM

(57) Abstract:

PURPOSE: To avoid a system halt by configuring the system so that the contents of the cache
memory of a memory controller in which trouble occurs are returned to a main storage device.
CONSTITUTION: When trouble occurs in a memory controller 3, the occurrence is reported to a
trouble processing device 5. In response to this report, the trouble processing device 5 gathers
trouble information on the memory controller 3. When this trouble information confirms that the
contents of the cache memory 31 are reliable, the trouble processing device 5 temporarily stops
the system and reads out the contents of a data array 31-a and an address array 31-b of the
cache memory 31 through a bus 103. When the effective bits and rewrite bits read out from the
cache memory show that the read data is effective and has been rewritten, the trouble processing
device 5 requests that this data be written to a main storage device 1.
FULL ENGLISH TRANSLATION OF JAPANESE LAID-OPEN PATENT HEI 02-017550

Description 
1. Title of the Invention

Trouble processing system for multi-processor systems
2. Claims

(1) A trouble processing system for a multi-processor system,
comprising:

first and second memory controllers which have first and
second cache memories, respectively, and first and second
storage means for storing effective bits, which indicate
whether the data stored in said first and second cache
memories, respectively, is effective, and rewrite bits, which
indicate whether said data has been rewritten;

first and second main storage devices;
first and second trouble processing devices;
extraction means which, when trouble is detected in said
first memory controller, extracts, from the data stored in
said first cache memory, data for which said effective bits
stored in said first storage means indicate effectiveness and
said rewrite bits indicate rewriting; and

reading means which reads out said data extracted by
said extraction means from said first cache memory,
wherein said data read out by said reading means is
written to said first main storage device via said second
trouble processing device and said second memory controller.
3. Description of the Invention
Technical Field 

The present invention relates to a trouble processing
system for a multi-processor system, and more particularly to
a trouble processing system for a memory controller which has
a store-in type cache memory.
Prior Art 

In a conventional information processing device
constituting a multi-processor system, if a store-in type
cache memory was used in the memory controller, no means was
available for writing the content of the cache memory back to
the main storage device when trouble occurred in the memory
controller.

Here, in the store-in type, when new content of the main
storage device is required and no open area is available in
the cache memory, content of the cache memory is returned to
the main storage device to create an open area, and the
content of the main storage device is written in that open
area. Since reads and writes are normally performed using
only the content of the cache memory, the content of the
cache memory and the content of the main storage device
differ.
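
The store-in behavior described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `StoreInCache`, the dictionary standing in for the main storage device, and the oldest-first choice of which entry to return are all assumptions made only for illustration (the text does not state a replacement policy).

```python
from collections import OrderedDict


class StoreInCache:
    """Toy store-in (write-back) cache in front of a dict-based main storage."""

    def __init__(self, main_storage, capacity):
        self.main = main_storage        # address -> value (stands in for main storage)
        self.capacity = capacity        # number of entries the cache can hold
        self.entries = OrderedDict()    # address -> [value, rewritten flag]

    def _make_room(self):
        # No open area: return the oldest entry to main storage if it was rewritten.
        # Oldest-first replacement is an assumption, not stated in the text.
        if len(self.entries) >= self.capacity:
            addr, (value, rewritten) = self.entries.popitem(last=False)
            if rewritten:
                self.main[addr] = value

    def _load(self, addr):
        if addr not in self.entries:
            self._make_room()
            self.entries[addr] = [self.main.get(addr, 0), False]
        return self.entries[addr]

    def read(self, addr):
        return self._load(addr)[0]

    def write(self, addr, value):
        entry = self._load(addr)
        entry[0] = value
        entry[1] = True                 # main storage now holds stale content


main = {0: 10, 1: 11, 2: 12}
cache = StoreInCache(main, capacity=2)
cache.write(0, 99)
stale = main[0]     # main storage has not been updated yet
cache.read(1)
cache.read(2)       # no open area: address 0 is returned to main storage first
```

In this sketch the write to address 0 reaches the main storage only when its entry is displaced, which is why a failed memory controller can leave the main storage device holding outdated content.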

In this conventional information processing device,
there was no means to write the content of the cache memory
back to the main storage device even if trouble occurred in
the memory controller. Consequently, when trouble occurs in a
component of the memory controller other than the cache
memory section, there is a disadvantage that, even if the
content of the cache memory is guaranteed, it is impossible
to return the content to the main storage device and
disconnect the memory controller to continue system
operation, and therefore a system shutdown occurs.
Object of the Invention 

In order to eliminate the foregoing problems, it is an
object of the present invention to provide a trouble
processing system which can return the content of the cache
memory of the memory controller in which the trouble
occurred to the main storage device, and thereby prevent a
system shutdown.

Configuration of the Invention 

The trouble processing system according to the present
invention is a trouble processing system for a
multi-processor system, comprising: first and second memory
controllers which have first and second cache memories,
respectively, and first and second storage means for storing
effective bits, which indicate whether the data stored in the
first and second cache memories, respectively, is effective,
and rewrite bits, which indicate whether the data has been
rewritten; first and second main storage devices; first and
second trouble processing devices; extraction means which,
when trouble is detected in the first memory controller,
extracts, from the data stored in the first cache memory,
data for which the effective bits stored in the first storage
means indicate effectiveness and the rewrite bits indicate
rewriting; and reading means which reads out the data
extracted by the extraction means from the first cache
memory, wherein the data read out by the reading means is
written to the first main storage device via the second
trouble processing device and the second memory controller.
Embodiment 

An embodiment of the present invention will now be
described with reference to the drawings.

Fig. 1 is a block diagram depicting a configuration of
an embodiment of the present invention. In Fig. 1, the
multi-processor system according to an embodiment of the
present invention comprises main storage devices 1 and 2,
memory controllers 3 and 4 which have cache memories 31 and
41 respectively, trouble processing devices 5 and 6, and
arithmetic processing units 7-i (i = 1, ...) and 8-j (j =
1, ...).

The main storage devices 1 and 2 are normally connected
with the memory controllers 3 and 4 via the buses 101 and 201
respectively. If the memory controller 3 or 4 is disabled
due to trouble, the connections change: if the memory
controller 3 is disabled, for example, the main storage
devices 1 and 2 are connected to the memory controller 4 via
the buses 102 and 201, and if the memory controller 4 is
disabled, the main storage devices 1 and 2 are connected to
the memory controller 3 via the buses 101 and 202. The memory
controllers 3 and 4 are interconnected via the bus 300.
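
The connection changes described above can be summarized as a small lookup table. The following Python sketch is only illustrative: the table layout and the function name `route` are assumptions, and the normal-operation pairing (bus 101 for main storage device 1, bus 201 for main storage device 2) follows the operation description later in this document.

```python
# storage device number -> (memory controller number, bus number)
NORMAL = {1: (3, 101), 2: (4, 201)}

# disabled memory controller -> storage device -> (surviving controller, bus)
FAILOVER = {
    3: {1: (4, 102), 2: (4, 201)},   # memory controller 3 disabled
    4: {1: (3, 101), 2: (3, 202)},   # memory controller 4 disabled
}


def route(storage, disabled_controller=None):
    """Return the (controller, bus) pair currently serving a main storage device."""
    table = NORMAL if disabled_controller is None else FAILOVER[disabled_controller]
    return table[storage]


normal_path = route(1)                            # (3, 101)
failover_path = route(1, disabled_controller=3)   # (4, 102)
```
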
The cache memories 31 and 41 of the memory controllers 3
and 4 are data buffers used when the arithmetic processing
units 7-i and 8-j, an input/output processing device which is
not illustrated, or the trouble processing devices 5 and 6
access the main storage, and are of the store-in type.

The memory controllers 3 and 4 are connected with the
trouble processing devices 5 and 6 via the buses 103 and 203
respectively, and are connected with the arithmetic
processing units 7-i and 8-j via the buses 104-i and 204-j
respectively.

The trouble processing devices 5 and 6 perform trouble 
information collection processing and trouble relief 
processing and so on for each device constituting the system. 

Fig. 2 is a diagram depicting reading data from the
cache memory 31 by the trouble processing device 5 in Fig. 1.
In Fig. 2, the cache memory 31 comprises a read address
register 31-1 and a read data register 31-2, and in each of
the read address register 31-1 and the read data register
31-2, the flip-flops are connected to form a chain.
Therefore the trouble processing device 5 can write and read
data to/from the read address register 31-1 and the read data
register 31-2 by a scan-in operation and a scan-out operation.

The cache memory 41 also has the same configuration as
the above-mentioned cache memory 31, and writing/reading
to/from the cache memory 41 by the trouble processing device
6 can likewise be performed by a scan-in operation and a
scan-out operation.
Fig. 3 is a diagram depicting the configuration of the
cache memory 31 in Fig. 1. In Fig. 3, the cache memory 31 is
comprised of a data array 31-a and an address array 31-b. In
the data array 31-a, the data is stored in n+1 byte units,
and in the address array 31-b, the memory address, the
effective bits, which indicate whether the data stored in the
data array 31-a is effective or not, and the rewrite bits,
which indicate whether the data stored in the data array 31-a
has been rewritten, are stored. The entries 0 to m of the
data array 31-a and of the address array 31-b correspond to
each other.
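
The layout of Fig. 3 can be pictured as two parallel arrays indexed by the same entry number. The Python sketch below is an illustration only: the names `AddressEntry`, `CacheMemory`, and `make_cache`, and the concrete sizes, are assumptions and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass
class AddressEntry:
    memory_address: int   # main storage address of the cached block
    effective: bool       # effective bit: the block holds usable data
    rewritten: bool       # rewrite bit: the block differs from main storage


@dataclass
class CacheMemory:
    data_array: list      # entry k holds one block of n+1 bytes
    address_array: list   # entry k holds the AddressEntry for data_array[k]


def make_cache(num_entries, block_size):
    """Build an empty cache with num_entries (= m+1) entries of block_size (= n+1) bytes."""
    return CacheMemory(
        data_array=[bytes(block_size) for _ in range(num_entries)],
        address_array=[AddressEntry(0, False, False) for _ in range(num_entries)],
    )


cache31 = make_cache(num_entries=4, block_size=8)   # m = 3, n = 7 in the text's notation
cache31.data_array[2] = bytes(range(8))
cache31.address_array[2] = AddressEntry(memory_address=0x1000, effective=True, rewritten=True)
```

Keeping the two arrays the same length mirrors the statement that entry k of the data array corresponds to entry k of the address array.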

The cache memory 41 is comprised of a data array and 
address array, just like the cache memory 31. 

Operation of an embodiment of the present invention will
now be described with reference to Fig. 1 to Fig. 3.

The main storage devices 1 and 2 are connected to the
memory controllers 3 and 4 respectively by the buses 101 and
201. When the arithmetic processing unit 7-i reads from or
writes to the main storage device 1, the cache memory 31 of
the memory controller 3 is used, and when the arithmetic
processing unit 8-j reads from or writes to the main storage
device 2, the cache memory 41 of the memory controller 4 is
used.

If the cache memories 31 and 41 do not hit at this time,
the data is read from the main storage devices 1 and 2 to the
cache memories 31 and 41 in predetermined units. If there is
no open area in the cache memories 31 and 41, however,
content in the predetermined units is first written back from
the cache memories 31 and 41 to the main storage devices 1
and 2 before the data is read from the main storage devices 1
and 2.

In this way, normal operation is performed and the cache
memories 31 and 41 are used as the data buffers of the main
storage devices 1 and 2.

The case when trouble occurs in the memory controller 3
during the above-mentioned normal operation will now be
described. It is assumed, however, that the content of the
cache memory 31 at this time is guaranteed.

When trouble occurs in the memory controller 3, the
trouble processing device 5 is notified of the trouble.

Responding to this notification, the trouble processing
device 5 collects the trouble information on the memory
controller 3. If this trouble information confirms that the
content of the cache memory 31 can be guaranteed, the trouble
processing device 5 temporarily stops the system, and reads
the content of the data array 31-a and the address array 31-b
from the cache memory 31 via the bus 103.

In other words, the trouble processing device 5 performs
a scan-in operation so that a desired address is set in the
read address register 31-1 of the cache memory 31, and then
applies one clock of the machine clock so that the data is
read from the desired address of the cache memory 31 and set
in the read data register 31-2.

The data which is set in the read data register 31-2 is
read by the trouble processing device 5 via a scan-out
operation.
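
The three-step read sequence (scan in the address, apply one machine clock, scan out the data) can be modeled as follows. This Python sketch is a simplified illustration under stated assumptions: the class `ScanReadableCache`, the most-significant-bit-first shift order, and the register widths are hypothetical and are not specified in the patent.

```python
class ScanReadableCache:
    """Toy cache whose read registers are reachable only by serial scan."""

    def __init__(self, contents):
        self.contents = list(contents)    # one integer word per cache address
        self.read_address_register = 0    # stands in for register 31-1
        self.read_data_register = 0       # stands in for register 31-2

    def scan_in_address(self, bits):
        # Shift the address in one bit at a time (MSB first, an assumption).
        value = 0
        for bit in bits:
            value = (value << 1) | bit
        self.read_address_register = value

    def clock(self):
        # One machine clock: the addressed word is set in the read data register.
        self.read_data_register = self.contents[self.read_address_register]

    def scan_out_data(self, width):
        # Shift the read data register out one bit at a time (MSB first).
        return [(self.read_data_register >> i) & 1 for i in reversed(range(width))]


def to_bits(value, width):
    return [(value >> i) & 1 for i in reversed(range(width))]


def from_bits(bits):
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def read_entry(cache, address, address_width, data_width):
    """Read one word the way the trouble processing device is described as doing."""
    cache.scan_in_address(to_bits(address, address_width))
    cache.clock()
    return from_bits(cache.scan_out_data(data_width))


cache31 = ScanReadableCache([0x11, 0x22, 0x33, 0x44])
word = read_entry(cache31, address=2, address_width=2, data_width=8)
```

Repeating `read_entry` for every address would give the trouble processing device the full content of the cache without using the normal data path of the failed controller.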
If the effective bits and the rewrite bits read in this
way from the address array 31-b of the cache memory 31 show
that the entry holds effective and rewritten data, the
trouble processing device 5 transfers the memory address of
this entry and the data of this entry in the data array 31-a
to the trouble processing device 6 via the bus 301, and
requests that this data be written to the main storage
device 1.

In this case, the main storage device 1 is connected to
the memory controller 4 via the bus 102, and memory is
accessed using the cache memory 41. The data of this entry in
the data array 31-a is written back to the main storage
device 1, at the memory address of this entry, via the
trouble processing device 6 and the memory controller 4.

In this way, rewrite processing to the main storage
device 1 is performed for the data of all the entries of the
cache memory 31. Then the system operation is restarted, and
operation is continued in the configuration of one memory
controller 4 and two main storage devices 1 and 2.
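
The selection and write-back of entries described above can be sketched in Python as follows. This is an illustration, not the patent's implementation: the dictionary-based entries, the function names, and the use of a plain callable to stand in for the path through the trouble processing device 6 and the memory controller 4 are all assumptions.

```python
def entries_to_write_back(address_array, data_array):
    """Yield (memory_address, data) for entries that are both effective and rewritten."""
    for entry, data in zip(address_array, data_array):
        if entry["effective"] and entry["rewritten"]:
            yield entry["memory_address"], data


def flush_failed_cache(address_array, data_array, write_via_surviving_controller):
    """Write every effective, rewritten entry back through the surviving path."""
    count = 0
    for memory_address, data in entries_to_write_back(address_array, data_array):
        write_via_surviving_controller(memory_address, data)
        count += 1
    return count


address_array = [
    {"memory_address": 0x100, "effective": True, "rewritten": True},
    {"memory_address": 0x200, "effective": True, "rewritten": False},
    {"memory_address": 0x300, "effective": False, "rewritten": True},
]
data_array = [b"A", b"B", b"C"]

# A dict stands in for main storage device 1 as reached via memory controller 4.
main_storage_1 = {0x100: b"old", 0x200: b"B"}
written = flush_failed_cache(address_array, data_array, main_storage_1.__setitem__)
```

Only the first entry is written in this example: the second already matches the main storage, and the third is not effective, so neither needs to be returned.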

Likewise, when trouble occurs in the memory controller 4,
rewrite processing to the main storage device 2 is performed
for the data of all the entries of the cache memory 41, in
the same way as the above-mentioned processing for when
trouble occurs in the memory controller 3.

In this way, when trouble occurs in the memory
controller 3 or 4, the data stored in the cache memory 31 or
41 is read, and if the effective bits and the rewrite bits
corresponding to this data indicate that the data is
effective and rewritten, this data is written to the main
storage device 1 or 2 via the trouble processing device 6 or
5 and the memory controller 4 or 3 respectively, so that the
content of the cache memory 31 or 41 of the memory controller
3 or 4 where the trouble occurred can be returned to the main
storage device 1 or 2, and the content of the main storage
devices 1 and 2 can be continuously guaranteed. As a
consequence, the memory controller 3 or 4 where trouble
occurred can be disconnected from the system to continue
operating the system, and a system shutdown can be prevented.
Effect of the Invention 

As described above, according to the present invention,
when trouble is detected in a first memory controller of a
multi-processor system which comprises first and second
memory controllers, first and second main storage devices,
and first and second trouble processing devices, if the
effective bits corresponding to the data stored in the first
cache memory disposed in the first memory controller indicate
that the data is effective and the rewrite bits indicate that
the data has been rewritten, the data is read from the first
cache memory and written to the first main storage device via
the second trouble processing device and the second memory
controller, so the content of the cache memory of the memory
controller where trouble occurred can be returned to the main
storage device, and a system shutdown can be prevented.
4. Brief Description of the Drawings 
Fig. 1 is a block diagram depicting a configuration of
an embodiment of the present invention, Fig. 2 is a diagram
depicting reading data from the cache memory by the trouble
processing device in Fig. 1, and Fig. 3 is a diagram
depicting the configuration of the cache memory.

Description of Reference Numbers of Major Components
1, 2 main storage device
3, 4 memory controller
5, 6 trouble processing device
7-1, 8-1 arithmetic processing unit
31, 41 cache memory
31-1 read address register
31-2 read data register
31-b address array
FIG. 1:

1, 2 main storage device
3, 4 memory controller
5, 6 trouble processing device
7-1, 8-1 arithmetic processing unit
31, 41 cache memory

FIG. 2:

5 trouble processing device
31 cache memory
31-1 read address register
31-2 read data register

FIG. 3:

31-a data array
31-b address array